An ensemble biclustering approach for querying gene expression compendia with experimental lists

نویسندگان

  • Riet De Smet
  • Kathleen Marchal
چکیده

MOTIVATION Query-based biclustering techniques allow interrogating a gene expression compendium with a given gene or gene list. They do so by searching for genes in the compendium that have a profile close to the average expression profile of the genes in this query-list. As it can often not be guaranteed that the genes in a long query-list will all be mutually coexpressed, it is advisable to use each gene separately as a query. This approach, however, leaves the user with a tedious post-processing of partially redundant biclustering results. The fact that for each query-gene multiple parameter settings need to be tested in order to detect the 'most optimal bicluster size' adds to the redundancy problem. RESULTS To aid with this post-processing, we developed an ensemble approach to be used in combination with query-based biclustering. The method relies on a specifically designed consensus matrix in which the biclustering outcomes for multiple query-genes and for different possible parameter settings are merged in a statistically robust way. Clustering of this matrix results in distinct, non-redundant consensus biclusters that maximally reflect the information contained within the original query-based biclustering results. The usefulness of the developed approach is illustrated on a biological case study in Escherichia coli. AVAILABILITY AND IMPLEMENTATION Compiled Matlab code is available from http://homes.esat.kuleuven.be/~kmarchal/Supplementary_Information_DeSmet_2011/.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

به کارگیری خوشه‌بندی دوبعدی با روش «زیرماتریس‌های با میانگین- درایه‌های بزرگ» در داده‌های بیان ژنی حاصل از ریزآرایه‌های DNA

Background and Objective: In recent years, DNA microarray technology has become a central tool in genomic research. Using this technology, which made it possible to simultaneously analyze expression levels for thousands of genes under different conditions, massive amounts of information will be obtained. While traditional clustering methods, such as hierarchical and K-means clustering have been...

متن کامل

Query-driven module discovery in microarray data

MOTIVATION Existing (bi)clustering methods for microarray data analysis often do not answer the specific questions of interest to a biologist. Such specific questions could be derived from other information sources, including expert prior knowledge. More specifically, given a set of seed genes which are believed to have a common function, we would like to recruit genes with similar expression p...

متن کامل

Biclustering of gene expression data

Biclustering is an important problem that arises in diverse applications, including the analysis of gene expression and drug interaction data. A large number of clustering approaches have been proposed for gene expression data obtained from microarray experiments. However, the results from the application of standard clustering methods to genes are limited. This limitation is imposed by the exi...

متن کامل

Biclustering of Gene Expression Using Glowworm Swarm Optimization and Neuro-Fuzzy Discriminant Analysis

-The advent of DNA microarray technologies has revolutionized the experimental study of gene expression. Biclustering is the most popular approach of analyzing gene expression data and has indeed proven to be successful in many applications. In recent years, several biclustering methods have been suggested to identify local patterns in gene expression data. Most of these algorithms represent gr...

متن کامل

BIDENS: Iterative Density Based Biclustering Algorithm With Application to Gene Expression Analysis

Biclustering is a very useful data mining technique for identifying patterns where different genes are co-related based on a subset of conditions in gene expression analysis. Association rules mining is an efficient approach to achieve biclustering as in BIMODULE algorithm but it is sensitive to the value given to its input parameters and the discretization procedure used in the preprocessing s...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Bioinformatics

دوره 27 14  شماره 

صفحات  -

تاریخ انتشار 2011